perm filename CONCLU[0,BGB]15 blob
sn#116835 filedate 1974-08-30 generic text, type C, neo UTF8
{⊂C;<N;αRESULTS AND CONCLUSIONS.;λ30;P116;I325,0;JCFA} SECTION 10.
{JCFD} RESULTS AND CONCLUSIONS.
{λ10;W250;JAFA}
10.1 Results: Accomplishments and Original Contributions.
10.2 Critique: Errors and Omissions.
10.3 Suggestions for Future Work.
10.4 Conclusions.
{λ30;W0;I700,0;JUFA}
⊂10.1 Results: Accomplishments and Original Contributions.⊃
As a regular feature in a Ph.D. dissertation, it is required
to state explicitly what has been accomplished and what is original.
Some of what has been accomplished is itemized in box 10.1, with the
so-called <original contributions> marked by asterisks. Each of the
accomplishments has been elaborated in the indicated chapter.
{|;λ10;T150,165,900;JA;FA}
BOX 10.1{JC} ACCOMPLISHMENTS AND ORIGINAL CONTRIBUTIONS.
0. The Geometric Feedback Vision Theory Chapter 6.
* 1. The Winged Edge Polyhedron Representation Chapter 2.
* 2. The Euler Primitives for Polyhedron Construction Chapter 3.
3. The Iron Triangle Camera Locus Algorithm Chapter 9.
* 4. The OCCULT hidden line elimination algorithm Chapter 4.
* 5. The Polygon Nesting Algorithm Chapter 7.
* 6. The Polygon Dekinking Method Chapter 7.
7. The Polygon Segmenting Method Chapter 7.
8. The Polygon Comparing Method Chapter 8.
* 9. Silhouette Cone Intersection Chapters 5 and 9.
{|;T-1;λ30;JUFA}
As a whole, the system described in this thesis is the third
of its kind, succeeding the systems of Roberts (1963) and Falk
(1970). Although the modeling routines of the present system are
considerably more sophisticated than those of its predecessors, the
improvement in the visual analysis routines is less dramatic and more
open to question. The present image analysis differs from the earlier
systems in that emphasis is placed on the use of multiple
images for the sake of parallax depth perception and in that several
spatially connected image representations are combined (contour image,
mosaic image and raster image) to preserve the structure of the scene
through feature extraction rather than following the earlier
paradigm of extracting features from the image piecemeal and
attempting to splice them together afterwards.
As a design theory, the present work can be compared with
earlier work by comparing the block diagrams. The characteristically
circular, mandala-like feedback vision diagrams appear in (Falk)
Figure 4-7, page 78; (Grape) Figure 12.1, page 242; (Tenenbaum)
Figure 1.13, page 43; as well as in this work Figure 6.1, page 70.
The feedback mandala is conspicuously absent in the best of the
stimulus-response visual parsing work, (Waltz), as well as in
statistical recognition work, (Duda and Hart). The important ideas
depicted in the feedback vision mandala are the duality of the
simulated and physical worlds, the duality of description and
verification, the dualism of camera and body locus solving, and the
dual opposing flows of predicted and perceived images along a
hierarchy of commensurate abstractions. Tenenbaum's figure illustrates
the basic feedback loop in the immediate vicinity of the visual
sensor. The diagrams of Falk and Grape are similar mirrors of the
overall system design of the Stanford Hand/Eye group (1969 to 1973)
under the leadership of Professor Jerome Feldman. The two diagrams
depict an array of relevant boxes (camera solver, edge finder, world
modeler and so on) all sending messages to each other under the
benign direction of a box labeled "general strategist".
Among the elements composing the GEOMED/CRE system, the most
original accomplishment is the winged edge polyhedron representation.
In computer graphics, models are based on face perimeter lists (or
arrays), with an awareness that more topological relations exist but
with no realization that a substantial improvement in surface
topology modeling is feasible using approximately the same resources.
Another accomplishment, the Euler primitives, was based on a
constructive proof of the Euler relation from (Coxeter 61). Other
graphics systems lack this level of abstraction that falls between the
level of node/link operations and operations with solids. The Euler
primitives were useful in implementing OCCULT and GEOMED sweep and
glue operations, but they were less useful in implementing the body
intersector, BIN.
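
To make the contrast with face perimeter lists concrete, the following is a minimal Python sketch, not the GEOMED implementation. It builds one fixed-size record per edge (two vertices, two faces and four neighboring "wing" edges) from oriented face perimeter lists, recovers a perimeter by following the wings, and checks the Euler relation V - E + F = 2 on a tetrahedron. The field names (`tail`, `left_next` and so on), the dictionary keyed by vertex pairs, and the helper functions are assumptions made for illustration only; the actual record layout is the one elaborated in Chapter 2.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

EdgeKey = Tuple[int, int]  # hypothetical edge name: (smaller vertex, larger vertex)


@dataclass
class WingedEdge:
    """One fixed-size record per edge (illustrative field names, not GEOMED's)."""
    tail: int
    head: int
    left_face: Optional[int] = None    # face whose perimeter runs tail -> head
    right_face: Optional[int] = None   # face whose perimeter runs head -> tail
    left_prev: Optional[EdgeKey] = None
    left_next: Optional[EdgeKey] = None
    right_prev: Optional[EdgeKey] = None
    right_next: Optional[EdgeKey] = None


def edge_key(a: int, b: int) -> EdgeKey:
    return (a, b) if a < b else (b, a)


def build_winged_edges(faces: List[Tuple[int, ...]]) -> Dict[EdgeKey, WingedEdge]:
    """Convert consistently oriented face perimeter lists into edge records."""
    edges: Dict[EdgeKey, WingedEdge] = {}
    for f, verts in enumerate(faces):
        n = len(verts)
        for i in range(n):
            p, a, b, q = verts[i - 1], verts[i], verts[(i + 1) % n], verts[(i + 2) % n]
            e = edges.setdefault(edge_key(a, b), WingedEdge(*edge_key(a, b)))
            if (a, b) == (e.tail, e.head):
                e.left_face, e.left_prev, e.left_next = f, edge_key(p, a), edge_key(b, q)
            else:
                e.right_face, e.right_prev, e.right_next = f, edge_key(p, a), edge_key(b, q)
    return edges


def face_perimeter(edges: Dict[EdgeKey, WingedEdge], face: int) -> List[int]:
    """Recover a face's vertex cycle by walking wing links, without a face list."""
    start = next(k for k, e in edges.items() if face in (e.left_face, e.right_face))
    out, k = [], start
    while True:
        e = edges[k]
        if e.left_face == face:
            out.append(e.tail)
            k = e.left_next
        else:
            out.append(e.head)
            k = e.right_next
        if k == start:
            return out


def euler_characteristic(faces, edges) -> int:
    vertices = {v for face in faces for v in face}
    return len(vertices) - len(edges) + len(faces)


# A tetrahedron given as four consistently oriented triangles.
TETRAHEDRON = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (0, 2, 3)]

if __name__ == "__main__":
    wings = build_winged_edges(TETRAHEDRON)
    print(len(wings), "edges; V - E + F =", euler_characteristic(TETRAHEDRON, wings))
    print("perimeter of face 0:", face_perimeter(wings, 0))
```

In this sketch each edge stores the same eight links regardless of how many sides its faces have, which is one way to read the remark that the richer topology costs approximately the same resources as perimeter lists.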
A pre-computer form of the Iron Triangle camera solving
method appears in a paper by Berkay (59). Berkay described the method
as an analog procedure to be performed with paper, ruler and a few
other photogrammetric hand tools. (The existence of this paper was
pointed out to me by Irwin Sobel.)
The original accomplishment of the hidden line eliminator,
OCCULT, lies in its unification of several methods and in its
exploitation of object and image coherence made possible by the Euler
primitives and the Winged Edge Representation.
The last five accomplishments listed in box 10.1 are related
to vision. The nesting and dekinking problems have been stated and
solved by others; the present solutions are original only in
technical detail: the nesting for its use of memory to avoid an
N-squared number of compares and the dekinking for its achievement of
good results with almost no effort. The recursive polygon
segmentation and the polygon compare idea were accomplishments that
were compatible with the contour image approach but are not
necessarily original ideas.
⊂10.2 Critique: Errors and Omissions.⊃
The major weakness in the existing modeling system is that it
lacks overall unity - the modeling and image analysis are not yet
sufficiently well integrated. The second major weakness is that the
essential subsystems involving comparing, locus solving and
recognition are still in a primitive condition. Consequently, an
unambiguous objective demonstration of the relevance of 3-D modeling
to computer vision is missing; the particular demonstration which I
had in mind was to have a robot vehicle drive outside around the
laboratory visually servoing along a trajectory given in advance.
In the course of this work, technical failures have included
the attempt to use Euler primitives to implement body intersection,
the attempt to bundle contour images into mosaic images, as well as
attempts to make the Euler kill primitives logically airtight
without time-consuming model checking. However, the worst errors are
of the form of misallocated effort; more time might have been spent
on image analysis and less on image synthesis and so forth. The
research suffers from not having a criterion for deciding which
objectives deserve the most immediate effort.
A final barrier to progress in computer vision is the
inadequacy of the hardware. It may be true that "It is a poor workman
who blames his tools", but for me the greatest source of personal
frustration has been the television cameras, the cart and the
turntable. At Stanford, these devices have not been implemented or
maintained with sufficient care to make them convenient to use.
⊂10.3 Suggestions for Future Work.⊃
{|λ9;JA}
Box 10.2 {λ7;JAJC} SUGGESTIONS FOR FUTURE WORK.
~SPATIAL MODELING WORK.~
1. Combination Geometric Models - Converters.
2. Cellular Space Modeling - Tetrahedral Simplices.
3. Spatial Simulation: Collision Avoidance Problem.
4. Higher Dimensionality, 4-D GEOMED.
~SIMULATIONS.~
5. Mechanical Simulation.
6. Creature Simulations.
7. Geometric Task Planning.
8. Geometric/Semantics Modeling.
~MATHEMATICALLY ORIENTED PROBLEMS.~
9. The Manifold Resurfacing Problem.
10. The Curved Patches Problem.
11. Prove the Correctness of a Hidden Line Eliminator.
~GET RICH QUICK APPLICATIONS.~
12. Automatic Machine Shop.
13. Animation for Entertainment Industry.
~SYSTEMS SOFTWARE AND VISION HARDWARE WORK.~
14. Better Loader and/or Incremental Assembler.
15. Better Cameras.
16. Image Oriented Number Crunching Computer Hardware.
17. Better Robot Vehicles.
{|λ30;JUFA}
The application of geometric modeling to vision and robotics
raises numerous interesting ideas and problems, box 10.2.
Future development of <Combination Geometric Models> may
begin by writing converters between geometric representations. For
example, there is a need to convert polyhedra into spine cross
sections, space points into polyhedra, contour maps into faceted
surfaces and so on. Extramural combination models include
<Geometric/Semantic Modeling> which will be needed to cover the gulf
between Minsky's (1974) notion of a visual frame-system (e.g. the
expectation of a room interior) and a geometric prediction of the
features to be found in the image. Although the Minsky Frame-System
theory does not explicitly reveal the crucial interface between
numerical geometric modeling and symbolic abstractions, that nexus is
a central part of the frame-system idea.
The <Cellular Space Modeling> idea is that both space and
objects should be modeled using a space-filling tessellation of cells,
perhaps using the tetrahedral 3-simplex. The difficulty lies in
getting the Euclidean primitives to update the geometry and
topology of empty space as an object moves and rotates. The rewards
might include an elegant approach to collision avoidance problems
in vehicle navigation and arm trajectory planning. Other approaches
to <spatial simulation> and <collision avoidance problems> that might
be pursued include the use of a simulated viewpoint to see obstacle-free
trajectories by means of hidden line elimination; this method is
suggested in (Sutherland 69).
In several recent Stanford dissertations, (Falk, Yakimovsky,
Grape, and so on) the authors conclude with the prediction that
their essentially 2-D techniques can readily be extended to 3-D in
future work. In my turn, I seriously wish to propose that my
essentially 3-D techniques can be extended to 4-D. The resulting
models could be applied to Regge Calculus for computing the general
relativistic geometric models of such systems as two or three
colliding black holes; or, on a less cosmic level, a 4-D GEOMED could be
of service for planning sequences of arm manipulations, viewing time
as a spatial dimension. Collision of 3-D polyhedra moving in
time can be described as a static intersection of 4-D polytopes.
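
The space-time view can be illustrated in a deliberately restricted case. The Python sketch below assumes axis-aligned boxes moving at constant velocity, which is far narrower than general polyhedra, and is not part of GEOMED. Each moving box sweeps out a 4-D prism in (x, y, z, t); the two prisms intersect exactly when some time t satisfies every per-axis overlap inequality, and since each inequality is linear in t the static 4-D intersection reduces to intersecting intervals of t. The function name `collision_interval` and its parameters are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def collision_interval(
    a_lo: Sequence[float], a_hi: Sequence[float], a_vel: Sequence[float],
    b_lo: Sequence[float], b_hi: Sequence[float], b_vel: Sequence[float],
    t0: float = 0.0, t1: float = 1.0,
) -> Optional[Tuple[float, float]]:
    """Return the time interval within [t0, t1] during which boxes A and B overlap.

    Box A occupies [a_lo + a_vel*t, a_hi + a_vel*t] on every axis, likewise B.
    Returns None when the two space-time prisms do not intersect.
    Assumption: axis-aligned boxes, constant velocities (illustrative only).
    """
    lo, hi = t0, t1
    for k in range(len(a_lo)):
        # Overlap on axis k requires both inequalities, each of the form coef*t <= rhs:
        #   a_lo + a_vel*t <= b_hi + b_vel*t   and   b_lo + b_vel*t <= a_hi + a_vel*t
        constraints = (
            (a_vel[k] - b_vel[k], b_hi[k] - a_lo[k]),
            (b_vel[k] - a_vel[k], a_hi[k] - b_lo[k]),
        )
        for coef, rhs in constraints:
            if coef > 0:
                hi = min(hi, rhs / coef)
            elif coef < 0:
                lo = max(lo, rhs / coef)
            elif rhs < 0:
                return None  # no relative motion on this axis and no overlap
        if lo > hi:
            return None
    return (lo, hi)


if __name__ == "__main__":
    # A unit cube at rest, and a second unit cube approaching along -x.
    print(collision_interval((0, 0, 0), (1, 1, 1), (0, 0, 0),
                             (3, 0, 0), (4, 1, 1), (-1, 0, 0), 0.0, 5.0))
```

For general polyhedra the same idea would call for a true 4-D polytope intersector; the sketch only shows why treating time as a fourth coordinate turns a dynamic question into a static one.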
Geometric modeling is also applicable to future work in
simulation. <Mechanical Simulation> involves computing the Newtonian
mechanics of everyday objects; problems which are immediately
approachable from a GEOMED foundation include simulated object
collision, statics, and pseudo friction. For example, consider what
is needed to predict the outcome of setting one more block at a given
place on an existing tower or of throwing one block into a tower of
other blocks. <Geometric Task Planning> problems include the old A.I.
favorite of block stacking as well as the newer research problems
related to industrial assembly. Existing solutions to geometric tasks
are notoriously restricted; for example, I know of no block stacking
program that handles arbitrary rotations: all blocks to date are
piled on the square.
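
As a small illustration of the statics question raised above, the following Python sketch predicts whether a tower stays up after one more block is set on top. It is a toy under strong assumptions that are not taken from the thesis: the blocks are uniform, rigid and piled on the square, only one horizontal coordinate is modeled, and friction and dynamics are ignored. The test applied is the elementary one that, at every interface, the center of mass of everything above must lie over the contact interval. The function name `tower_is_stable` and the tuple layout are hypothetical.

```python
from typing import List, Tuple

Block = Tuple[float, float, float]  # (x_left, x_right, mass); hypothetical layout


def tower_is_stable(blocks: List[Block]) -> bool:
    """Static check for a pile of uniform blocks, listed bottom first.

    Assumptions (illustrative only): the bottom block rests firmly on the
    ground, each block lies directly on the one beneath it, and the pile is
    stable when, at every interface, the center of mass of all blocks above
    that interface falls within the contact interval of the two blocks.
    """
    for i in range(1, len(blocks)):
        above = blocks[i:]
        total_mass = sum(m for _, _, m in above)
        center = sum(m * (left + right) / 2 for left, right, m in above) / total_mass
        support_left = max(blocks[i][0], blocks[i - 1][0])
        support_right = min(blocks[i][1], blocks[i - 1][1])
        if not (support_left <= center <= support_right):
            return False
    return True


if __name__ == "__main__":
    tower = [(0.0, 2.0, 1.0), (0.9, 2.9, 1.0)]
    print("existing tower stable:", tower_is_stable(tower))
    # The new block balances on the block beneath it, but shifts the combined
    # center of mass of the upper two blocks past the edge of the bottom block.
    print("after adding a block: ", tower_is_stable(tower + [(1.8, 3.8, 1.0)]))
```

A block stacking program that handled arbitrary rotations would need the full 3-D contact polygons that a geometric model can supply, rather than the intervals used here.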
Although it has been recognized (early and often) that the
programming of numerically controlled machine tools should be
automated, the actual implementation of a system that builds
artifacts directly from a geometric model still lies in the future.
As a start, someone at any of the research labs with a general
purpose manipulator could begin by carving models out of soap or
other soft material with a rotating cutting tool.
Advanced mechanical simulations as well as <Animation for
Entertainment> quickly run into the problem of <Creature Simulation>
- given a multilegged bug, what control program is required to make
the bug walk through a building. Barring the darkness of war, it is
likely that the greatest potential future users of robotic simulation
will not be found in government, universities, or manufacturing
industries but rather in the entertainment industry. When it becomes
economically feasible to create realistic (and surrealistic)
animation by computer graphics, great progress will be made in
simulating visual reality and in representing mundane situations in a
computer.
Theoretical work in geometric modeling will continue to
pursue curved representations. Two problems that I would especially
like to see solved involve fitting curved surfaces to form a smooth
object, (a manifold), as well as resurfacing an existing manifold
representation. Both problems, I believe, are more a question of
automatic segmentation than of automatic smoothing. It is easy to
fit functions to facial patches of an object; it is hard to subdivide
an object into the proper set of patches. In terms of analysis of
algorithms and the mathematical theory of computation, the one
geometric algorithm that seems most ripe for future quantitative study
and logical analysis is the hidden line elimination process. There is
a wealth of different techniques to be compared and the inputs and
outputs seem to be sufficiently well defined for formal axiomatizing.
Finally, progress in computer vision and geometric modeling
requires progress in systems software and computer systems. In my
opinion, recent university based research in programming languages is
overconcentrated in very high level language theory and automatic
programming. Future language and systems work should include
developing an incremental loader, assembler, debugger and editor that
can handle algebraic expressions, block structure, node/link
storage notation as well as unvarnished machine instructions.
Although special purpose image processing hardware has earned a bad
reputation (starting with the Illiac-III), in my opinion a real
vision system will be composed of a large array of computer-like
elements (4096 by 4096) that pipeline a stream of images into
structured image representations. The perceived images are then
compared with predicted images and a detailed 3-D model is altered or
constructed in real time (24 images per second) using a small number
of computers (32 or less) which by the standards of our day (1974)
would be very large and very fast (ten megawords main memory and ten
megahertz instruction execution). Assuming the continuation of
civilization with a growing technology over the next one hundred to a
thousand years, developments in Computer Vision and Artificial
Intelligence could lead to robots, androids and cyborgs which will be
able to see, to think and to feel conscious.
⊂10.4 Conclusions.⊃
The particular technical conclusions of this work include the
methods, system designs and data structures for geometric modeling
which have already been elaborated. Based on the details, one could
make such generalized observations as these: that recursive windowing is
a good technique for spatial sorting, that simple geometric
representations fall into space oriented and object oriented classes,
that the essence of an object representation is its coherence under
various operators, and that the power of a vision system might be
enhanced by application of 3-D modeling techniques. However, in closing,
I would like to draw
three rather more general conclusions, conclusions which by contrast
to the technical ones might be construed as scientific conclusions.
1. ~<The Nature of Perception>~. Perception is essential to
intelligence as it is the process which converts external sensations
into internal thoughts. There are two kinds of simple perception
systems: stimulus-response and prediction-correction feedback;
together they explain perception.{Q}
2. ~<The Necessity to Experiment>~. Robotic hardware is
essential to Artificial Intelligence as an experimental science. It
is misleading to study only theoretical robotics of plausible
abstractions, mathematics, puzzles, games and simulations. The real
physical world is the best test of adaptive general intelligence. The
complexity and subtlety of real world situations, even of a situation
as seemingly finite as a digital television picture, cannot be
anticipated from a philosopher's armchair or from a programmer's
console.
3. ~<The Necessity to Simulate Visual Reality>~. Modeling is
essential to prediction-correction feedback perception. Although
simulated robot environments should not be used in place of the
external physical reality, such environmental simulations are an
essential part of a robot's internal mental reality. In the
particular case of vision, geometric models should be easy to adapt
to the basic mental abilities of present day computer hardware. To
conclude, perception requires two worlds: one that is the external
physical reality and the other that is the internal mental reality.
{H2;X0.6;I∂400,630;*RUNNER;}